Search for fermentation metabolites in MSV000084900

Author

Evoquant LLC

Introduction

The fermented foods residency group at the Astera Institute wanted to search for a set of metabolites in an existing public dataset: MassIVE MSV000084900, GNPS Global Foodomics dataset 3500 (“Metabolite analysis of 3,500 food and beverage samples, ethanol extraction. Data were acquired using a Bruker Daltonics maXis Impact and C18 RP-UHPLC. Positive polarity acquisition of LC-MS/MS.”). Before this analysis, we evaluated how likely each metabolite on the list was to be detected in the dataset. Almost all of the metabolites of interest are low-molecular-weight and polar, so they are poorly retained on C18 columns and may elute before MS scans begin. Many are also much more likely to be detected in negative ion mode than in positive ion mode, and the dataset was collected in positive mode only. Despite these limitations, it was decided to proceed with the search. This notebook reports the results of searching for these analytes across the 3,500 mzXML files using the GNPS2 platform.

Table of Metabolites of Interest
Metabolite of Interest InChI Key PubChem CID Formula Monoisotopic Mass Chemical Structure GNPS Library Entries (Positive) GNPS Library Entries (Negative) All Adducts (Positive) All Ion Sources (Positive) All Instruments (Positive) All Compound Sources (Positive)
hyodeoxycholic acid DGABKXLVXPYZII-SIBKNCMHSA-N 5283820 C24H40O4 392.2927 19 38 M-H2O+H, M+H, 2M+H, 2M+Na, M+Na, M-2H2O+H, M+K, M+H-H2O ESI Orbitrap, qTof Isolated, commercial, crude, Commercial
Conjugated linoleic acid JBYXPOFIGCOSSB-XBLVEGMJSA-N 5282796 C18H32O2 280.2402 8 15 M+H, M+H-H2O ESI Ion Trap, qTof Isolated
Indole-3-lactic acid XGILAAMKEQUXLS-UHFFFAOYSA-N 92904 C11H11NO3 205.0739 52 29 M+H, M+K, M+H-H2O, M-H2O+H ESI, Positive, LC-ESI Quattro_QQQ:40eV, Quattro_QQQ:25eV, Ion Trap, qTof, Orbitrap, Quattro_QQQ:10eV Isolated, isolated, Commercial, NIH Natural Product Library
Tryptophan QIVBCDIJIAJPQS-VIFPVBQESA-N 6305 C11H12N2O2 204.0899 221 160 M+H, [M+H-H2O]+, [M+Na]+, [M+K]+, [M+H]+, 2M+H, [M+H], M+Na ESI, DI-ESI, Positive, LC-ESI QQQ, Q-Exactive Plus, LC-ESI-ITFT, Quattro_QQQ:25eV, Ion Trap, Quattro_QQQ:40eV, qTof, Flow-injection QqQ/MS, Orbitrap, Quattro_QQQ:10eV, Hybrid FT, LC-ESI-QTOF, LC-Q-TOF/MS Other, Isolated, Commercial standard, isolated, Crude, Commercial
Indopropionic acid GOLXRNDWAUTYKT-UHFFFAOYSA-N 3744 C11H11NO2 189.079 39 49 M+H, M-H2O+H, M+H-H2O ESI, Positive, LC-ESI Quattro_QQQ:40eV, Quattro_QQQ:25eV, Ion Trap, qTof, Orbitrap, Quattro_QQQ:10eV Isolated, Commercial, isolated
Indole-3-propionic acid GOLXRNDWAUTYKT-UHFFFAOYSA-N 3744 C11H11NO2 189.079 39 49 M+H, M-H2O+H, M+H-H2O ESI, Positive, LC-ESI Quattro_QQQ:40eV, Quattro_QQQ:25eV, Ion Trap, qTof, Orbitrap, Quattro_QQQ:10eV Isolated, Commercial, isolated
4-hydroxyphenyllactic acid JVGVDSSUAVXRDY-UHFFFAOYSA-N 9378 C9H10O4 182.0579 22 53 M-H2O+H, M+H, [M+H]+, [M+H], M+Na ESI, Positive QQQ, Quattro_QQQ:40eV, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, Quattro_QQQ:10eV, LC-Q-TOF/MS Isolated, Commercial, isolated
Hippuric acid QIAFMBKCNZACKA-UHFFFAOYSA-N 464 C9H9NO3 179.0582 90 105 M+H, [M+H]+, 2M+H, [M+H], M+Na ESI, LC-ESI QQQ, Q-Exactive Plus, Ion Trap, qTof, Flow-injection QqQ/MS, Orbitrap Isolated, Commercial standard, isolated, Crude, Commercial
Serotonin QZAYGJVTTNCVMB-UHFFFAOYSA-N 5202 C10H12N2O 176.095 85 7 M+H, [M+H-H2O]+, [M+Na]+, [M+K]+, [M+H]+, 2M+H, [M+H], M+Na ESI, Positive, LC-ESI QQQ, Q-Exactive Plus, Quattro_QQQ:40eV, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, Quattro_QQQ:10eV, LC-ESI-QTOF, LC-Q-TOF/MS , Isolated, Commercial standard, isolated, Crude, Commercial
3-3-PPA QVWAEZJXDYOKEH-UHFFFAOYSA-N 91 C9H10O3 166.063 0 18
D-phenyllactic acid VOXXWSYKYCBWHO-QMMMGPOBSA-N 444718 C9H10O3 166.063 22 68 M+H, M+K, [M+H], [M]+ ESI, Positive Quattro_QQQ:40eV, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, Quattro_QQQ:10eV Crude, Isolated, Commercial, isolated
Glutamate WHUUTDBJXJRKMK-VKHMYHEASA-N 33032 C5H9NO4 147.0532 196 86 M+H, [M+H-H2O]+, [M+H]+, 2M+H, [M+H], M-H+2Na, M+H-H2O ESI, DI-ESI, Positive, LC-ESI QQQ, Quattro_QQQ:40eV, LC-ESI-ITFT, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, LC-ESI-QQ, Quattro_QQQ:10eV, LC-ESI-QTOF, LC-Q-TOF/MS Lysate, Isolated, isolated, Crude, Commercial
alpha-hydroxyisocaproate LVRFTAZAXQPQHI-UHFFFAOYSA-N 92779 C6H12O3 132.0786 7 28 M+H, [M+H] ESI, Positive Quattro_QQQ:40eV, Quattro_QQQ:25eV, Flow-injection QqQ/MS, Orbitrap, Quattro_QQQ:10eV, LC-Q-TOF/MS Isolated, isolated
2-hydroxy-3-methylvalerate RILPIWOPNGRASR-RFZPGFLSSA-N 10796774 C6H12O3 132.0786 0 4
alpha-hydroxyisovalerate NGEWQZIDQIYUNV-UHFFFAOYSA-N 99823 C5H10O3 118.063 1 22 [M+H] ESI Flow-injection QqQ/MS Isolated
Succinate KDYFGRWQOYBRFD-UHFFFAOYSA-L 160419 C4H4O4-2 116.011 7 52 M+H, [M+H], M+H-H2O ESI, LC-ESI QQQ, qTof, Flow-injection QqQ/MS, Orbitrap, LC-Q-TOF/MS Isolated, Commercial, isolated
Histamine NTYJJOPFIAHURM-UHFFFAOYSA-N 774 C5H9N3 111.0796 50 3 M+H, [M+H], M+Na, [M+H]+ ESI, Positive, LC-ESI QQQ, Q-Exactive Plus, Quattro_QQQ:40eV, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, Quattro_QQQ:10eV, LC-Q-TOF/MS Isolated, Commercial standard, Commercial, isolated
3-hydroxybutyric acid WHBMMWSBFZVSSR-UHFFFAOYSA-N 441 C4H8O3 104.0473 11 43 M+H, [M+H], [M+H]+ ESI QQQ, LC-ESI-IT, qTof, Flow-injection QqQ/MS, Orbitrap, LC-Q-TOF/MS Isolated, isolated
GABA BTCSSZJGUNDROE-UHFFFAOYSA-N 119 C4H9NO2 103.0633 61 16 M+H, [M+H]+, [M+H], M-H+2Na, M+H-H2O ESI, Positive, LC-ESI Q-Exactive Plus, QQQ, LC-ESI-ITFT, Quattro_QQQ:40eV, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, LC-ESI-QQ, Quattro_QQQ:10eV, LC-ESI-QTOF, LC-Q-TOF/MS Isolated, Commercial standard, Commercial, isolated
Lactic acid JVTAAEKCZFNVCJ-UHFFFAOYSA-N 612 C3H6O3 90.0317 10 34 M+H, [M+H] ESI LC-Q-TOF/MS, Orbitrap, QQQ, Flow-injection QqQ/MS Isolated
TMAO UYPYRKYUKCHHIB-UHFFFAOYSA-N 1145 C3H9NO 75.0684 27 0 M+H, [M+H], [M+H]+ ESI, Positive QQQ, Quattro_QQQ:40eV, Quattro_QQQ:25eV, qTof, Flow-injection QqQ/MS, Orbitrap, LC-ESI-QQ, Quattro_QQQ:10eV, LC-ESI-QTOF, LC-Q-TOF/MS Isolated, isolated
Propionate XBDQKXXYIPTUBI-UHFFFAOYSA-M 104745 C3H5O2- 73.029 381 6 M+H, [M+Na]+, [M+H]+ ESI, Positive Quattro_QQQ:10eV, Quattro_QQQ:25eV, Orbitrap, Quattro_QQQ:40eV Isolated, Commercial standard
TMA GETQZCLCWQTVFV-UHFFFAOYSA-N 1146 C3H9N 59.0735 9 0 M+H ESI, Positive Quattro_QQQ:25eV, Orbitrap, Quattro_QQQ:10eV, Quattro_QQQ:40eV Isolated
Acetate QTBSBXVTEAMEQO-UHFFFAOYSA-M 175 C2H3O2- 59.0133 3 11 M+H Positive Quattro_QQQ:10eV, Quattro_QQQ:25eV, Quattro_QQQ:40eV Isolated
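The monoisotopic masses and adduct labels in the table determine which precursor m/z values to expect in positive mode. The following is a minimal sketch of that arithmetic, not code from the project's `scripts` package. The constant names, the `ADDUCT_SHIFTS` table, and the `expected_mz` helper are assumptions introduced for illustration, and only singly charged adducts are covered.

```python
# Monoisotopic masses in Da.
PROTON = 1.007276          # H atom minus one electron
WATER = 18.010565          # H2O
SODIUM_ION = 22.989221     # Na atom minus one electron
POTASSIUM_ION = 38.963158  # K atom minus one electron

# Mass shift added to the neutral monoisotopic mass M for each
# singly charged positive-mode adduct (illustrative subset).
ADDUCT_SHIFTS = {
    "M+H": PROTON,
    "M+H-H2O": PROTON - WATER,
    "M+Na": SODIUM_ION,
    "M+K": POTASSIUM_ION,
}


def expected_mz(monoisotopic_mass: float, adduct: str) -> float:
    """Return the expected precursor m/z for a singly charged adduct."""
    return monoisotopic_mass + ADDUCT_SHIFTS[adduct]


# Masses are taken from the table above.
for name, mass, adduct in [
    ("Tryptophan", 204.0899, "M+H"),
    ("Serotonin", 176.095, "M+H"),
    ("4-hydroxyphenyllactic acid", 182.0579, "M+H-H2O"),
]:
    print(f"{name} {adduct}: {expected_mz(mass, adduct):.4f}")
```

For tryptophan, M+H works out to about 205.097, which is the precursor mass reported for the tryptophan cluster later in this notebook.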
Show the code
# Standard library
import ast
import os
import pickle
from pathlib import Path

# Third-party
import pandas as pd
import plotly.io as pio

# Project scripts
from scripts.compare_search_results_to_gnps_library import run_in_parallel, load_gnps_library
from scripts.metabolites_of_interest_gnps_library_lookup import create_df as create_metabolites_of_interest_gnps_library_lookup
from scripts.mirror_plot import mirror_plot, plot_gnps_hits
from scripts.search_mzxml_for_ms2_by_precursor import main as search_mzxml_for_ms2_by_precursor

pd.set_option('display.max_columns', None)
pd.set_option('display.width', 0) # Adjust display width if needed

# Get environment variables or use default values
TOP_DIR = os.getenv("TOP_DIR", ".")
CALC_DIR = Path(os.getenv("CALC_DIR", Path(TOP_DIR, "data", "calculated"))).resolve()
INTERMEDIATE_DIR = Path(os.getenv("INTERMEDIATE_DIR", Path(CALC_DIR, "intermediate"))).resolve()
MSV000084900_DIR = Path(os.getenv("MSV000084900_DIR", Path(TOP_DIR, "data/MSV000084900/v02"))).resolve()
GNPS2_RESULTS_DIR = Path(os.getenv("GNPS2_RESULTS_DIR", Path(TOP_DIR, "data/gnps2_results/18d82f1067e643538adf9b73147122c3"))).resolve()

CALC_DIR.mkdir(parents=True, exist_ok=True)
INTERMEDIATE_DIR.mkdir(parents=True, exist_ok=True)

# Check the directories
def check_top_dir(TOP_DIR=TOP_DIR):
    TOP_DIR = Path(TOP_DIR).resolve()
    if not TOP_DIR.exists():
        raise FileNotFoundError(f"Directory {TOP_DIR} does not exist.")
    # The marker file "topdirvalidator" is used to validate TOP_DIR
    with open(Path(TOP_DIR, "topdirvalidator"), "r") as f:
        # Strip whitespace so a trailing newline does not break the comparison
        line = f.readline().strip()
        if line != "4a1b9837696299d80967925a0e49fff8":
            raise ValueError(f"Unexpected TOP_DIR: {TOP_DIR}")

def check_msv000084900_dir(MSV000084900_DIR=MSV000084900_DIR):
    if not MSV000084900_DIR.exists():
        raise FileNotFoundError(f"Directory {MSV000084900_DIR} does not exist.")

def check_gnps2_results_dir(GNPS2_RESULTS_DIR=GNPS2_RESULTS_DIR):
    # Check that clusterinfo.tsv exists
    clusterinfo_path = Path(GNPS2_RESULTS_DIR, "nf_output/clustering/clusterinfo.tsv")
    if not clusterinfo_path.exists():
        raise FileNotFoundError(f"Expected file {clusterinfo_path} does not exist.")

check_top_dir(TOP_DIR)
check_msv000084900_dir(MSV000084900_DIR)
check_gnps2_results_dir(GNPS2_RESULTS_DIR)

MOLECULE_OF_INTEREST = [
    '2-hydroxy-3-methylvalerate', '3-3-PPA', '3-hydroxybutyric acid',
    'hyodeoxycholic acid', '4-hydroxyphenyllactic acid', 'Acetate',
    'alpha-hydroxyisocaproate', 'Conjugated linoleic acid', 'D-phenyllactic acid',
    'GABA', 'Glutamate', 'Hippuric acid', 'Histamine', 'Indole-3-lactic acid',
    'Indole-3-propionic acid', 'Indopropionic acid', 'Lactic acid', 'Propionate',
    'Serotonin', 'Succinate', 'TMAO', 'TMA', 'Tryptophan', 'alpha-hydroxyisovalerate',
]

Find all GNPS library entries that match the metabolites of interest

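Given the names and PubChem CIDs of the metabolites of interest, find all matching entries in the GNPS public library by planar InChIKey (the first 14-character block of the InChIKey, which encodes connectivity and ignores stereochemistry).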
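To illustrate what matching by planar InChIKey means, here is a minimal sketch, not the implementation in `scripts.metabolites_of_interest_gnps_library_lookup`. The record layout, the `planar_inchikey` and `library_matches` helpers, and the `EXAMPLE_NO_STEREO` entry are hypothetical. The other two spectrum IDs and InChIKeys are taken from the result tables later in this notebook.

```python
def planar_inchikey(inchikey: str) -> str:
    """Return the first block of an InChIKey (connectivity only, no stereochemistry)."""
    return inchikey.strip().split("-")[0]


# Hypothetical, simplified library records; the real GNPS library JSON
# is assumed to carry many more fields than shown here.
library = [
    {"spectrum_id": "CCMSLIB00005778066", "InChIKey": "QIVBCDIJIAJPQS-VIFPVBQESA-N"},
    {"spectrum_id": "EXAMPLE_NO_STEREO", "InChIKey": "QIVBCDIJIAJPQS-UHFFFAOYSA-N"},
    {"spectrum_id": "CCMSLIB00005883437", "InChIKey": "QIAFMBKCNZACKA-UHFFFAOYSA-N"},
]


def library_matches(query_inchikey: str, records: list) -> list:
    """Return spectrum IDs whose planar InChIKey equals that of the query."""
    query_planar = planar_inchikey(query_inchikey)
    return [
        record["spectrum_id"]
        for record in records
        if record.get("InChIKey") and planar_inchikey(record["InChIKey"]) == query_planar
    ]


# L-tryptophan matches both the stereo-specific and the stereo-free entry.
tryptophan_hits = library_matches("QIVBCDIJIAJPQS-VIFPVBQESA-N", library)
print(tryptophan_hits)
```

Comparing only the first block means that stereoisomers and entries deposited without stereochemistry are treated as the same compound. This suits MS/MS matching, which generally cannot distinguish them. Entries that lack an InChIKey are skipped by this approach.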

Show the code
create_metabolites_of_interest_gnps_library_lookup(
  json_path = Path(TOP_DIR, 'data', 'gnpslibrary.json'),
  outpath = Path(INTERMEDIATE_DIR, 'metabolites_of_interest_gnps_library_lookup.tsv')
)

GNPS2 Library Matches

Search by planar InChIKey

Only matches where a GNPS library entry shares its planar InChIKey with a metabolite of interest are displayed.

The tables are filtered results from spectral clustering by GNPS2 (nf_output/networking/clustersummary_with_network.tsv). Each mirror plot compares the GNPS library spectrum with the first entry in the GNPS2 cluster.

Show the code
from IPython.display import display, HTML

df = pd.read_csv(Path(INTERMEDIATE_DIR, 'metabolites_of_interest_gnps_library_lookup.tsv'), sep='\t')
gnps_result_df = pd.read_csv(Path(GNPS2_RESULTS_DIR, 'nf_output/networking/clustersummary_with_network.tsv'), sep='\t', low_memory=False)
df['submatch_positive'] = df['submatch_positive'].apply(ast.literal_eval)
df['submatch_negative'] = df['submatch_negative'].apply(ast.literal_eval)
df['gnps_ids'] = df.apply(lambda row: [i['spectrum_id'] for i in row['submatch_positive']] + 
                                   [i['spectrum_id'] for i in row['submatch_negative']], axis=1)
df.drop(columns=['submatch_positive', 'submatch_negative'], inplace=True)

no_matches = []
for _, row in df.iterrows():
    # Keep only GNPS2 clusters whose library hit is one of this metabolite's library entries
    tdf = gnps_result_df[gnps_result_df['SpectrumID'].isin(row['gnps_ids'])]
    if tdf.empty:
        no_matches.append(row['metabolite_of_interest'])
    else:
        print(f"### {row['metabolite_of_interest']}\n")
        display(HTML(tdf.to_html(index=False, classes='table')))
        # MGF_DATA_DICT must already be defined; its construction is not shown in this notebook
        if tdf.iloc[0]['SpectrumID'] not in MGF_DATA_DICT:
            print(f"Warning: SpectrumID {tdf.iloc[0]['SpectrumID']} not found in MGF_DATA_DICT\n")
            continue
        fig = plot_gnps_hits(
            tdf=tdf,
            mgf=MGF_DATA_DICT,
            GNPS2_RESULTS_DIR=GNPS2_RESULTS_DIR,
            MSV000084900_DIR=MSV000084900_DIR,
        )
        fig.show()

# Print metabolites with no matches
print("- **No matches found for the following metabolites:**")
for metabolite in no_matches:
    print(f"  - {metabolite}")

Indole-3-lactic acid

cluster index number of spectra parent mass precursor charge precursor mass sum(precursor intensity) RTMean component SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name Ion_Source Instrument Compound_Source PI Data_Collector Adduct Precursor_MZ ExactMass Charge CAS_Number Pubmed_ID Smiles INCHI INCHI_AUX Library_Class IonMode Organism LibMZ UpdateWorkflowName LibraryQualityString tags molecular_formula InChIKey InChIKey-Planar superclass class subclass npclassifier_superclass npclassifier_class npclassifier_pathway library_usi
2627 27 206.081 0 206.081 429935.0 3.290781 -1 CCMSLIB00003134667 2627.0 specs_ms.mgf GNPS-NIST14-MATCHES.mgf 0.882585 23046.3 0.0 38.8709 5.0 0.008011 206.081 0.0 temp/specs_ms.mgf2627 1.0 Spectral Match to DL-Indole-3-lactic acid from NIST14 ESI qTof Isolated Data from McKerrow Data deposited by lamccall M+H 206.089 205.074 1.0 832973 NaN C1=CC=C2C(=C1)C(=CN2)CC(C(=O)O)O InChI=1S/C11H11NO3/c13-10(11(14)15)5-7-6-12-9-4-2-1-3-8(7)9/h1-4,6,10,12-13H,5H2,(H,14,15) NaN 3.0 Positive GNPS-NIST14-MATCHES 206.089 UPDATE-SINGLE-ANNOTATED-BRONZE Bronze NaN C11H11NO3 XGILAAMKEQUXLS-UHFFFAOYSA-N XGILAAMKEQUXLS Organoheterocyclic compounds Indoles and derivatives Indolyl carboxylic acids and derivatives Tryptophan alkaloids Simple indole alkaloids Alkaloids mzspec:GNPS:GNPS-LIBRARY:CCMSLIB00003134667

Tryptophan

cluster index number of spectra parent mass precursor charge precursor mass sum(precursor intensity) RTMean component SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name Ion_Source Instrument Compound_Source PI Data_Collector Adduct Precursor_MZ ExactMass Charge CAS_Number Pubmed_ID Smiles INCHI INCHI_AUX Library_Class IonMode Organism LibMZ UpdateWorkflowName LibraryQualityString tags molecular_formula InChIKey InChIKey-Planar superclass class subclass npclassifier_superclass npclassifier_class npclassifier_pathway library_usi
2550 2036 205.097 0 205.097 157104000.0 0.951604 3985 CCMSLIB00005778066 2550.0 specs_ms.mgf MASSBANK.mgf 0.966012 240801.0 0.0 0.0 6.0 0.0 205.097 0.0 temp/specs_ms.mgf2550 1.0 Massbank:FIO00630 Tryptophan ESI qTof Isolated Massbank Massbank M+H 205.097 0.0 1.0 73-22-3 NaN C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)N 1S/C11H12N2O2/c12-9(11(14)15)5-7-6-13-10-4-2-1-3-8(7)10/h1-4,6,9,13H,5,12H2,(H,14,15)/t9-/m0/s1 NaN 3.0 Positive MASSBANK 205.097 UPDATE-SINGLE-ANNOTATED-BRONZE Bronze NaN C11H12N2O2 QIVBCDIJIAJPQS-VIFPVBQESA-N QIVBCDIJIAJPQS Organoheterocyclic compounds Indoles and derivatives Indolyl carboxylic acids and derivatives Small peptides Aminoacids Amino acids and Peptides|Shikimates and Phenylpropanoids mzspec:GNPS:GNPS-LIBRARY:CCMSLIB00005778066

Hippuric acid

cluster index number of spectra parent mass precursor charge precursor mass sum(precursor intensity) RTMean component SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name Ion_Source Instrument Compound_Source PI Data_Collector Adduct Precursor_MZ ExactMass Charge CAS_Number Pubmed_ID Smiles INCHI INCHI_AUX Library_Class IonMode Organism LibMZ UpdateWorkflowName LibraryQualityString tags molecular_formula InChIKey InChIKey-Planar superclass class subclass npclassifier_superclass npclassifier_class npclassifier_pathway library_usi
659 10 180.066 0 180.066 2727860.0 1.456516 -1 CCMSLIB00005883437 659.0 specs_ms.mgf GNPS-LIBRARY.mgf 0.965707 9691.8 0.0 0.0 5.0 0.0 180.066 0.0 temp/specs_ms.mgf659 1.0 HIPPURATE - 30.0 eV ESI Orbitrap Isolated Madeleine Ernst Anna Abrahamsson M+H 180.066 0.0 1.0 495-69-2 NaN OC(=O)CNC(=O)C1=CC=CC=C1 InChI=1S/C9H9NO3/c11-8(12)6-10-9(13)7-4-2-1-3-5-7/h1-5H,6H2,(H,10,13)(H,11,12) NaN 1.0 Positive GNPS-LIBRARY 180.066 UPDATE-SINGLE-ANNOTATED-GOLD Gold NaN C9H9NO3 QIAFMBKCNZACKA-UHFFFAOYSA-N QIAFMBKCNZACKA Benzenoids Benzene and substituted derivatives Benzoic acids and derivatives Small peptides Aminoacids Amino acids and Peptides mzspec:GNPS:GNPS-LIBRARY:CCMSLIB00005883437

Serotonin

cluster index number of spectra parent mass precursor charge precursor mass sum(precursor intensity) RTMean component SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name Ion_Source Instrument Compound_Source PI Data_Collector Adduct Precursor_MZ ExactMass Charge CAS_Number Pubmed_ID Smiles INCHI INCHI_AUX Library_Class IonMode Organism LibMZ UpdateWorkflowName LibraryQualityString tags molecular_formula InChIKey InChIKey-Planar superclass class subclass npclassifier_superclass npclassifier_class npclassifier_pathway library_usi
594 6 177.102 0 177.102 249916.0 0.471619 -1 CCMSLIB00003134728 594.0 specs_ms.mgf GNPS-NIST14-MATCHES.mgf 0.807959 5647.54 0.0 0.0 6.0 0.0 177.102 0.0 temp/specs_ms.mgf594 1.0 Spectral Match to Serotonin from NIST14 ESI qTof Isolated Data from Jairam KP Vanamala, PhD Data deposited by mjmeehan M+H 177.102 176.095 1.0 50679 NaN C1=CC2=C(C=C1O)C(=CN2)CCN InChI=1S/C10H12N2O/c11-4-3-7-6-12-10-2-1-8(13)5-9(7)10/h1-2,5-6,12-13H,3-4,11H2 NaN 3.0 Positive GNPS-NIST14-MATCHES 177.102 UPDATE-SINGLE-ANNOTATED-BRONZE Bronze NaN C10H12N2O QZAYGJVTTNCVMB-UHFFFAOYSA-N QZAYGJVTTNCVMB Organoheterocyclic compounds Indoles and derivatives Tryptamines and derivatives Tryptophan alkaloids Simple indole alkaloids Alkaloids mzspec:GNPS:GNPS-LIBRARY:CCMSLIB00003134728
  • No matches found for the following metabolites:
    • hyodeoxycholic acid
    • Conjugated linoleic acid
    • Indopropionic acid
    • Indole-3-propionic acid
    • 4-hydroxyphenyllactic acid
    • 3-3-PPA
    • D-phenyllactic acid
    • Glutamate
    • alpha-hydroxyisocaproate
    • 2-hydroxy-3-methylvalerate
    • Succinate
    • Histamine
    • 3-hydroxybutyric acid
    • GABA
    • Lactic acid
    • TMAO
    • Propionate
    • TMA
    • Acetate
    • alpha-hydroxyisovalerate
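The MZErrorPPM values in the tables above appear to be consistent with the reported MassDiff divided by the library m/z (LibMZ), scaled to parts per million. This is an inference from the displayed values, not a documented GNPS2 formula. The `ppm_error` helper below is a hypothetical name used to sketch the calculation.

```python
def ppm_error(mass_diff: float, reference_mz: float) -> float:
    """Return the absolute mass error in parts per million relative to reference_mz."""
    return abs(mass_diff) / reference_mz * 1e6


# Indole-3-lactic acid row: MassDiff = 0.008011, LibMZ = 206.089
indole_lactic_ppm = ppm_error(0.008011, 206.089)

# Tryptophan row: MassDiff = 0.0, LibMZ = 205.097
tryptophan_ppm = ppm_error(0.0, 205.097)

print(f"Indole-3-lactic acid: {indole_lactic_ppm:.2f} ppm")
print(f"Tryptophan: {tryptophan_ppm:.2f} ppm")
```

The indole-3-lactic acid value comes out close to the reported 38.87 ppm. Small differences are expected because the table values are rounded for display.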

Manual examination of GNPS2 search results

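Manually inspecting the GNPS2 library search results turned up two additional hits related to the metabolites of interest that the planar InChIKey lookup did not return:

SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name
CCMSLIB00003138284 371 specs_ms.mgf GNPS-NIST14-MATCHES.mgf 0.863861 10861.2 0 0 5 0 165.055 0 temp/specs_ms.mgf371 1 Spectral Match to p-Hydroxyphenyllactic acid from NIST14
CCMSLIB00010010644 8345 specs_ms.mgf ECG-ACYL-AMIDES-C4-C24-LIBRARY.mgf 0.986215 8975.94 0 4.16397 6 0.000991821 238.19 0 temp/specs_ms.mgf8345 1 histamine-C8:0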
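A name-based search is one way such hits could be surfaced programmatically when a library entry has no InChIKey or is a derivative of the target compound. The sketch below is an assumption about how this might be done, not a record of how the manual examination was performed. The `normalize` and `find_by_name` helpers and the keyword lists are hypothetical. The rows reuse SpectrumID and Compound_Name values from this notebook.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", text.lower())


def find_by_name(rows: list, keywords_by_metabolite: dict) -> dict:
    """Map each metabolite to SpectrumIDs whose Compound_Name contains a keyword."""
    hits = {}
    for metabolite, keywords in keywords_by_metabolite.items():
        wanted = [normalize(keyword) for keyword in keywords]
        ids = [
            row["SpectrumID"]
            for row in rows
            if any(w in normalize(row["Compound_Name"]) for w in wanted)
        ]
        if ids:
            hits[metabolite] = ids
    return hits


rows = [
    {"SpectrumID": "CCMSLIB00003138284",
     "Compound_Name": "Spectral Match to p-Hydroxyphenyllactic acid from NIST14"},
    {"SpectrumID": "CCMSLIB00010010644", "Compound_Name": "histamine-C8:0"},
    {"SpectrumID": "CCMSLIB00005883437", "Compound_Name": "HIPPURATE - 30.0 eV"},
]

# Hypothetical keyword lists chosen to tolerate naming variants such as "p-" vs "4-".
keywords = {
    "4-hydroxyphenyllactic acid": ["hydroxyphenyllactic"],
    "Histamine": ["histamine"],
}

name_hits = find_by_name(rows, keywords)
print(name_hits)
```

Substring matching also returns derivatives such as histamine-C8:0, which is an N-acyl histamine and not histamine itself, so every name-based hit still needs manual review.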

Show the code
gnps_result_df = pd.read_csv(Path(GNPS2_RESULTS_DIR, 'nf_output/networking/clustersummary_with_network.tsv'), sep='\t', low_memory=False)
tdf = gnps_result_df[gnps_result_df['SpectrumID'].isin(['CCMSLIB00003138284'])]
print("### 4-hydroxyphenyllactic acid")
display(HTML(tdf.to_html(index=False, classes='table')))
fig = plot_gnps_hits(
    tdf=tdf,
    mgf=MGF_DATA_DICT,
    GNPS2_RESULTS_DIR=GNPS2_RESULTS_DIR,
    MSV000084900_DIR=MSV000084900_DIR,
)
fig.show()

4-hydroxyphenyllactic acid

cluster index number of spectra parent mass precursor charge precursor mass sum(precursor intensity) RTMean component SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name Ion_Source Instrument Compound_Source PI Data_Collector Adduct Precursor_MZ ExactMass Charge CAS_Number Pubmed_ID Smiles INCHI INCHI_AUX Library_Class IonMode Organism LibMZ UpdateWorkflowName LibraryQualityString tags molecular_formula InChIKey InChIKey-Planar superclass class subclass npclassifier_superclass npclassifier_class npclassifier_pathway library_usi
371 12 165.055 0 165.055 1055820.0 1.524683 7747 CCMSLIB00003138284 371.0 specs_ms.mgf GNPS-NIST14-MATCHES.mgf 0.863861 10861.2 0.0 0.0 5.0 0.0 165.055 0.0 temp/specs_ms.mgf371 1.0 Spectral Match to p-Hydroxyphenyllactic acid from NIST14 ESI HCD Isolated Data from Jairam KP Vanamala, PhD Data deposited by mjmeehan M+H-H2O 165.055 0.0 1.0 306230 NaN NaN NaN NaN 3.0 Positive GNPS-NIST14-MATCHES 165.055 UPDATE-SINGLE-ANNOTATED-BRONZE Bronze NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN mzspec:GNPS:GNPS-LIBRARY:CCMSLIB00003138284
Show the code
gnps_result_df = pd.read_csv(Path(GNPS2_RESULTS_DIR, 'nf_output/networking/clustersummary_with_network.tsv'), sep='\t', low_memory=False)
tdf = gnps_result_df[gnps_result_df['SpectrumID'].isin(['CCMSLIB00010010644'])]
print('### Histamine-C8:0')
display(HTML(tdf.to_html(index=False, classes='table')))
fig = plot_gnps_hits(
    tdf=tdf,
    mgf=MGF_DATA_DICT,
    GNPS2_RESULTS_DIR=GNPS2_RESULTS_DIR,
    MSV000084900_DIR=MSV000084900_DIR,
)
fig.show()

Histamine-C8:0

cluster index number of spectra parent mass precursor charge precursor mass sum(precursor intensity) RTMean component SpectrumID #Scan# SpectrumFile LibraryName MQScore TIC_Query RT_Query MZErrorPPM SharedPeaks MassDiff SpecMZ SpecCharge FileScanUniqueID NumberHits Compound_Name Ion_Source Instrument Compound_Source PI Data_Collector Adduct Precursor_MZ ExactMass Charge CAS_Number Pubmed_ID Smiles INCHI INCHI_AUX Library_Class IonMode Organism LibMZ UpdateWorkflowName LibraryQualityString tags molecular_formula InChIKey InChIKey-Planar superclass class subclass npclassifier_superclass npclassifier_class npclassifier_pathway library_usi
8345 9 238.19 0 238.19 1325900.0 4.124967 9461 CCMSLIB00010010644 8345.0 specs_ms.mgf ECG-ACYL-AMIDES-C4-C24-LIBRARY.mgf 0.986215 8975.94 0.0 4.16397 6.0 0.000992 238.19 0.0 temp/specs_ms.mgf8345 1.0 histamine-C8:0 ESI qTof Crude Dorrestein Emily Gentry M+H 238.191 237.184 1.0 NaN NaN CCCCCCCC(NCCC1=NC=CN1)=O InChI=1S/C13H23N3O/c1-2-3-4-5-6-7-13(17)16-9-8-12-14-10-11-15-12/h10-11H,2-9H2,1H3,(H,14,15)(H,16,17) NaN 2.0 Positive ECG-ACYL-AMIDES-C4-C24-LIBRARY 238.191 UPDATE-SINGLE-ANNOTATED-SILVER Silver NaN C13H23N3O OFFGMPVWGIKLDD-UHFFFAOYSA-N OFFGMPVWGIKLDD Lipids and lipid-like molecules Fatty Acyls Fatty amides Histidine alkaloids Imidazole alkaloids Alkaloids mzspec:GNPS:GNPS-LIBRARY:CCMSLIB00010010644